Signal processing method and corresponding encoding method and device

ABSTRACT

The invention relates to a method of defining a new set of codewords for use in a variable length coding algorithm, and to a data encoding method using such a code. Said coding method comprises at least the steps of applying a transform to said data and coding the obtained coefficients by means of the variable length coding algorithm. The code used in said algorithm is built with the same length distribution as the binary Huffman code distribution, and is constructed by implementation of the following steps: (a) creating a synchronization tree structure of the code with decreasing depths for each elementary branch of said tree, with initialized parameters D=l_(max), K=n_(lmax)/2, and current l=l_(cur)=l_(max) (D and K being integers representing respectively the maximum length of a string of zeros and the maximum length of a string of ones, l_(max) the greatest codeword length, and n_(lmax) the number of codewords of length l_(max) in the Huffman code); (b) for each length l_(cur) beginning from l_(max), if n′_(lcur)≠n_(lcur), using the codeword 1_(k) as prefix and anchoring to it the maximal size elementary branch of depth D′=l_(cur)−K; (c) if 1_(k) cannot be used as prefix, finding a suitable prefix by choosing the minimal length codeword that is in excess with respect to the desired distribution.

FIELD OF THE INVENTION

The present invention generally relates to the field of data compression and, more specifically, to a method of processing digital signals for reducing the amount of data used to represent them.

The invention also relates to a method of encoding digital signals that incorporates said signal processing method, and to a corresponding encoding device.

BACKGROUND OF THE INVENTION

Variable length codes, such as described for example in the document U.S. Pat. No. 4,316,222, are used in many fields such as video coding in order to digitally encode symbols that occur with unequal probabilities: words with high probabilities are assigned short binary codewords, while those with low probabilities are assigned long codewords. These codes, however, suffer from the drawback of being very susceptible to errors such as inversions, deletions, insertions, etc., with a resulting loss of synchronization (itself resulting in an error state) which leads to extended errors in the decoded bitstream. Many words may indeed be decoded incorrectly as transmission continues.
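By way of illustration only, the loss of synchronization described above can be reproduced with the following Python sketch. The four-symbol prefix code, the message and the function names are assumptions made for the example; they are not taken from the cited document.

```python
# Hypothetical 4-symbol prefix code, chosen only for illustration.
CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}
DECODE = {bits: symbol for symbol, bits in CODE.items()}


def encode(symbols):
    """Concatenate the codewords of the given symbols."""
    return "".join(CODE[s] for s in symbols)


def decode(bits):
    """Greedy prefix decoding: emit a symbol as soon as a codeword is matched."""
    decoded, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in DECODE:
            decoded.append(DECODE[buffer])
            buffer = ""
    return decoded


def invert_bit(bits, position):
    """Simulate a transmission error by inverting one bit."""
    flipped = "1" if bits[position] == "0" else "0"
    return bits[:position] + flipped + bits[position + 1:]


message = list("abacadab")
sent = encode(message)
received = invert_bit(sent, 1)  # a single inversion inside the first "b"

print("".join(decode(sent)))
print("".join(decode(received)))
```

In this small case a single inverted bit changes both the decoded symbols and their number before the decoder falls back onto a codeword boundary, which is the effect measured by the error span.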

How quickly a decoder may recover synchronization from an error state is measured by the error span, i.e. the average number of symbols decoded until re-synchronization:

$$E_{s} = \sum_{k \in I} P^{err}_{C_{k}} \times N_{k} \qquad (1)$$

where I is the set of the codeword indexes, P^(err)_(C_k) is the probability that the erroneous symbol is C_(k), and N_(k) is the average number of symbols to be decoded until synchronization when the corrupted symbol is C_(k). For a code well matched to the source statistics, the probability of a codeword C_(k) can be approximated by P_(C_k)=2^(−l_k), where l_(k) is the length of C_(k), and the probability that the erroneous symbol is C_(k) can be approximated by P^(err)_(C_k)=2^(−l_k)×(l_(k)/l), where l is the average length of the code. The expression of E_(s) then becomes:

$$E_{s} = \sum_{k \in I} 2^{-\ell_{k}} \times \frac{\ell_{k}}{\ell} \times N_{k} \qquad (2)$$

According to said expression, the most probable symbols have a greater impact on E_(s), and their contribution should therefore be minimized. For this purpose, the following family F of variable length codes is defined (expression (3)):

$$F = \left\{ \begin{array}{ll} \{1_{i}\,0_{j}\,1\} & \text{for } i \in [0, K-1] \text{ and } j \in [1, D-1] \\ \{1_{i}\,0_{D}\} & \text{for } i \in [0, K-1] \\ 1_{k} & \end{array} \right. \qquad (3)$$

where 1_(i) and 0_(i) represent i-length strings of ones and zeros, and D and K are arbitrary integers with K≤D (an example of tree structure for such a fast synchronizing code with (D, K)=(4, 3) is given in FIG. 1, in which the black circles correspond to codewords and the white circles to error states). Assuming that D and K are large enough, the most probable (MP) codewords, i.e. the shortest ones, belong to the subset C_(MP) of the family F:

$$C_{MP} = \{1_{i}\,0_{j}\,1\}_{i \in [0, K-1],\ j \in [1, D-1]} \qquad (4)$$

On these codewords, several types of error positions are possible (transformation of the original codeword into one valid codeword, into the concatenation of two valid codewords, into an error state, or into the concatenation of a valid codeword and an error state). Considering that the recovery from an error state ES_(k) resulting from an erroneous codeword C_(k) also depends on the codeword C_(h) following the error state, it can then be shown that, for any error state such that (l_(k)+l_(h)<D and C_(h)≠1_(k)), the resulting approximate error span E_(s) is bounded (assuming that D and K are large enough), and that the synchronization is always recovered after decoding C_(h).
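As a non-authoritative illustration, the family F of expression (3) can be enumerated with the short Python sketch below. The function names are hypothetical, and the last element 1_(k) is read here, as an assumption, as the string of K ones. The sketch also checks that the resulting set is prefix-free and that its Kraft sum equals one.

```python
from fractions import Fraction


def family_f(D, K):
    """Enumerate the codewords of family F (expression (3)) as bit strings."""
    words = []
    for i in range(K):                      # i in [0, K-1]
        for j in range(1, D):               # j in [1, D-1]
            words.append("1" * i + "0" * j + "1")
        words.append("1" * i + "0" * D)     # 1_i 0_D
    words.append("1" * K)                   # assumed reading of 1_k: K ones
    return words


def is_prefix_free(words):
    """In sorted order, any prefix relation shows up between neighbours."""
    ordered = sorted(words)
    return not any(b.startswith(a) for a, b in zip(ordered, ordered[1:]))


def kraft_sum(words):
    """Sum of 2^(-length) over all codewords, computed exactly."""
    return sum(Fraction(1, 2 ** len(w)) for w in words)


code = family_f(D=4, K=3)                   # the (D, K) = (4, 3) example of FIG. 1
print(len(code), is_prefix_free(code), kraft_sum(code))
```

With (D, K)=(4, 3) the enumeration yields K×D+1=13 codewords, the shortest being 01.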

However, in spite of this recovery performance, such a structure is far from optimal in terms of average length and, moreover, does not reach every possible compression rate; hence it cannot be applied to an arbitrary source.
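The rigidity of this structure can be made concrete with the following Python sketch, given as an assumption-laden illustration rather than as part of the prior art: the length distribution of family F is entirely fixed by the pair (D, K), so an arbitrary target distribution generally has no matching pair. The function names, the search bound and the target distributions are hypothetical.

```python
from collections import Counter


def family_f_lengths(D, K):
    """Length distribution of family F, derived directly from expression (3)."""
    lengths = [i + j + 1 for i in range(K) for j in range(1, D)]  # 1_i 0_j 1
    lengths += [i + D for i in range(K)]                          # 1_i 0_D
    lengths.append(K)                                             # K ones (assumed)
    return Counter(lengths)


def matching_parameters(target, max_param=8):
    """All (D, K) with K <= D <= max_param whose distribution equals target."""
    return [(D, K)
            for D in range(1, max_param + 1)
            for K in range(1, D + 1)
            if family_f_lengths(D, K) == target]


dist = family_f_lengths(4, 3)
print(sorted(dist.items()))

# Hypothetical target: four codewords of length 2 (a valid Huffman distribution).
print(matching_parameters(Counter({2: 4})))
# Hypothetical target that happens to be reachable.
print(matching_parameters(Counter({1: 1, 2: 1, 3: 2})))
```

The first target cannot be produced by any pair (D, K) within the search bound, whereas the second is obtained with (D, K)=(3, 1).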

SUMMARY OF THE INVENTION

It is therefore an object of the invention to propose a processing method in which the operation of defining a set of codewords avoids these limitations.

To this end, the invention relates to a method of processing digital signals for reducing the amount of data used to represent said digital signals and forming by means of a variable length coding step a set of codewords such that the more frequently occurring values of digital signals are represented by shorter code lengths and the less frequently occurring values by longer code lengths, said variable length coding step including a defining sub-step for generating said set of codewords and in which the code used is built with the same length distribution L′=(n′_(i)) [i=1, 2, . . . , l_(max)] as the binary Huffman code distribution L=(n_(i)) [i=1, 2, . . . , l_(max)], n_(i) being the number of codewords of length i, and is constructed by implementation of the following steps:

-   (a) creating a synchronization tree structure of the code with decreasing depths for each elementary branch of said tree, with initialized parameters D=l_(max), K=n_(lmax)/2, and current l=l_(cur)=l_(max), the notations being:
    -   D=arbitrary integer representing the maximum length of a string of zeros;
    -   l_(max)=the greatest codeword length;
    -   K=arbitrary integer representing the maximum length of a string of ones;
    -   n_(lmax)=number of codewords of length l_(max) in the Huffman code;
-   (b) for each length l_(cur) beginning from l_(max), if n′_(lcur)≠n_(lcur), using the codeword 1_(k) as prefix and anchoring to it the maximal size elementary branch of depth D′=l_(cur)−K;
-   (c) if 1_(k) cannot be used as prefix, finding a suitable prefix by choosing the minimal length codeword that is in excess with respect to the desired distribution.
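The condition n′_(lcur)≠n_(lcur) tested in step (b), and the notion of a codeword length being "in excess" in step (c), both amount to comparing two length distributions. The following Python sketch shows one possible way of doing so; the function names and the three small codes are assumptions introduced for the example only.

```python
from collections import Counter


def length_distribution(codewords):
    """Return [n_1, ..., n_lmax], n_i being the number of codewords of length i."""
    counts = Counter(len(w) for w in codewords)
    l_max = max(counts)
    return [counts.get(i, 0) for i in range(1, l_max + 1)]


def length_gaps(target, current):
    """Map each length i to n_i - n'_i wherever the two distributions differ.

    A negative value means the code under construction has codewords in
    excess at that length; a positive value means some are still missing.
    """
    size = max(len(target), len(current))
    padded_target = target + [0] * (size - len(target))
    padded_current = current + [0] * (size - len(current))
    return {i + 1: n - n_prime
            for i, (n, n_prime) in enumerate(zip(padded_target, padded_current))
            if n != n_prime}


huffman_like = ["0", "10", "110", "111"]   # hypothetical reference code
candidate = ["1", "01", "001", "000"]      # hypothetical code with the same lengths
partial = ["1", "01", "00"]                # hypothetical unfinished code

L = length_distribution(huffman_like)
print(L, length_gaps(L, length_distribution(candidate)))
print(length_gaps(L, length_distribution(partial)))
```

Two codes with identical length distributions have the same average length for a given source, which is why the construction only needs to drive every gap to zero.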

It is another object of the invention to propose a method of encoding digital signals incorporating said processing method.

To this end, the invention relates to a method of encoding digital signals comprising at least the steps of applying to said digital signal an orthogonal transformation producing a plurality of coefficients, quantizing said coefficients and coding the quantized coefficients by means of a variable length coding step in which the more frequently occurring values are represented by shorter code lengths and the less frequently occurring values by longer code lengths, said variable length coding step including a defining sub-step for generating a set of codewords corresponding to said digital signals and in which the code used is built with the same length distribution L′=(n′_(i)) [i=1, 2, . . . , l_(max)] as the binary Huffman code distribution L=(n_(i)) [i=1, 2, . . . , l_(max)], n_(i) being the number of codewords of length i, and is constructed by implementation of the following steps:

-   (a) creating a synchronization tree structure of the code with decreasing depths for each elementary branch of said tree, with initialized parameters D=l_(max), K=n_(lmax)/2, and current l=l_(cur)=l_(max), the notations being:
    -   D=arbitrary integer representing the maximum length of a string of zeros;
    -   l_(max)=the greatest codeword length;
    -   K=arbitrary integer representing the maximum length of a string of ones;
    -   n_(lmax)=number of codewords of length l_(max) in the Huffman code;
-   (b) for each length called l_(cur) beginning from l_(max), if n′_(lcur)≠n_(lcur), using the codeword 1_(k) as prefix and anchoring to it the maximal size elementary branch of depth D′=l_(cur)−K;
-   (c) if 1_(k) cannot be used as prefix, finding a suitable prefix by choosing the minimal length codeword that is in excess with respect to the desired distribution.
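The chain formed by the orthogonal transformation, the quantization and the variable length coding step may be pictured with the minimal Python sketch below. It rests on several assumptions that are not part of the specification: a one-dimensional orthonormal DCT-II stands in for the orthogonal transformation, a uniform quantizer of step `step` is used, and a signed exponential-Golomb code is swapped in as a placeholder for the variable length code, since the codeword table produced by steps (a) to (c) is not reproduced here.

```python
import math


def dct_ii(block):
    """Orthonormal 1-D DCT-II, one example of an orthogonal transformation."""
    n = len(block)
    coefficients = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(block))
        coefficients.append(scale * total)
    return coefficients


def quantize(coefficients, step):
    """Uniform quantization to integer levels."""
    return [round(c / step) for c in coefficients]


def signed_exp_golomb(value):
    """Placeholder variable length code (not the code of the invention)."""
    unsigned = 2 * value - 1 if value > 0 else -2 * value
    binary = bin(unsigned + 1)[2:]
    return "0" * (len(binary) - 1) + binary


def encode_block(block, step):
    """Transform, quantize, then code each level with the placeholder code."""
    levels = quantize(dct_ii(block), step)
    return levels, "".join(signed_exp_golomb(v) for v in levels)


levels, bits = encode_block([4, 4, 0, 0], step=1)
print(levels, bits)
```

Small levels, which are the most frequent after quantization, receive the shortest codewords; replacing `signed_exp_golomb` by a table built according to the invention would leave the rest of the chain unchanged.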

It is still another object of the invention to propose an encoding device corresponding to said encoding method.

To this end, the invention relates to a device for encoding digital signals, said device comprising at least an orthogonal transform module, applied to said input digital signals for producing a plurality of coefficients, a quantizer, coupled to said transform module for quantizing said plurality of coefficients, and a variable length coder, coupled to said quantizer for coding said plurality of quantized coefficients in accordance with a variable length coding algorithm and generating an encoded stream of data bits, said coefficient coding operation, in which the more frequently occurring values are represented by shorter code lengths and the less frequently occurring values by longer code lengths, including a defining sub-step for generating a set of codewords corresponding to said digital signals and in which the code used is built with the same length distribution L′=(n′_(i)) [i=1, 2, . . . , l_(max)] as the binary Huffman code distribution L=(n_(i)) [i=1, 2, . . . , l_(max)], n_(i) being the number of codewords of length i, and is constructed by implementation of the following steps:

-   (a) creating a synchronization tree structure of the code with decreasing depths for each elementary branch of said tree, with initialized parameters D=l_(max), K=n_(lmax)/2, and current l=l_(cur)=l_(max), the notations being:
    -   D=arbitrary integer representing the maximum length of a string of zeros;
    -   l_(max)=the greatest codeword length;
    -   K=arbitrary integer representing the maximum length of a string of ones;
    -   n_(lmax)=number of codewords of length l_(max) in the Huffman code;
-   (b) for each length l_(cur) beginning from l_(max), if n′_(lcur)≠n_(lcur), using the codeword 1_(k) as prefix and anchoring to it the maximal size elementary branch of depth D′=l_(cur)−K;
-   (c) if 1_(k) cannot be used as prefix, finding a suitable prefix by choosing the minimal length codeword that is in excess with respect to the desired distribution.

The proposed principle for a new, generic variable length code tree structure, which keeps the optimal length distribution of the Huffman code while also offering a noticeable improvement of the error span, performs as well as the solution proposed in the cited document, but with much lower complexity, which allows the algorithm according to the invention to be applied to both short and longer codes, for example the code used in H.263 video coders.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described in a more detailed manner, with reference to the accompanying drawings in which:

FIG. 1 shows an example of tree structure of a fast synchronizing code;

FIG. 2 gives a flowchart of a synchronization optimization algorithm according to the invention;

FIG. 3 is a table illustrating the comparison between the solution according to the invention and the prior art.

DETAILED DESCRIPTION

Since the limitations indicated hereinabove for the structure according to the prior art, for the family F of variable length codes, come from the fact that the codes are the repetition of K elementary branches of the same depth D (illustrated in dashed line in FIG. 1), the main idea of the invention is to build codes in which the branch sizes may vary. Let L=(n_(i)) [i=1, 2, . . . , l_(max)] be the binary Huffman code length distribution, with n_(i) designating the corresponding number of codewords of length i and l_(max) the greatest codeword length, n_(lmax) being even by construction. The algorithm given in the flowchart of FIG. 2 then produces a code with a length distribution L′=(n′_(i)) [i=1, 2, . . . , l_(max)] which is identical to L after implementation of the following main steps:

-   creating a synchronization tree with decreasing depths for each elementary branch (originally, with initialized parameters D=l_(max), K=n_(lmax)/2, and current l=l_(cur)=l_(max)) in order to ensure that n′_(lmax)=n_(lmax) (upper part of FIG. 2);
-   for each length l_(cur) beginning from l_(max), and if n′_(lcur)≠n_(lcur), using the codeword 1_(k) as prefix and anchoring to said codeword the maximal size elementary branch of depth D′=l_(cur)−K (in FIG. 2, left loop L1);
-   if 1_(k) cannot be used as prefix (either because l_(cur) is too small or because using 1_(k) would irreparably deplete the current length distribution), finding a suitable prefix by choosing the minimal length codeword that is in excess with respect to the desired distribution (in FIG. 2, right loop L2, in which l_(free) designates, as indicated in FIG. 2, the first index in {i | n_(i)−n′_(i)<0}, previously defined within the loop L1).
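The starting point of these steps, namely the Huffman length distribution L and the initialized parameters D=l_(max) and K=n_(lmax)/2, can be obtained as in the Python sketch below. It covers the initialization only and does not attempt to reproduce the loops of FIG. 2; the function names and the symbol weights are assumptions made for the example.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_lengths(weights):
    """Codeword length of each symbol in a binary Huffman code."""
    tiebreak = count()  # unique rank so that symbol groups are never compared
    heap = [(w, next(tiebreak), [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    lengths = [0] * len(weights)
    while len(heap) > 1:
        w1, _, group1 = heapq.heappop(heap)
        w2, _, group2 = heapq.heappop(heap)
        for i in group1 + group2:       # merged symbols sink one level deeper
            lengths[i] += 1
        heapq.heappush(heap, (w1 + w2, next(tiebreak), group1 + group2))
    return lengths


def initial_parameters(weights):
    """Return the distribution n_i, l_max, and the initial D and K."""
    lengths = huffman_lengths(weights)
    n = Counter(lengths)
    l_max = max(lengths)
    D = l_max
    K = n[l_max] // 2                   # n_lmax is even for a Huffman code
    return n, l_max, D, K


weights = [4, 2, 2, 1, 1]               # hypothetical symbol frequencies
n, l_max, D, K = initial_parameters(weights)
print(sorted(n.items()), l_max, D, K)
```

For these weights the Huffman code has three codewords of length 2 and two of length 3, giving D=3 and K=1 as starting values.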

The invention also relates to a method of encoding digital signals that incorporates a processing method as described above for reducing the amount of data representing input digital signals, said method making it possible to generate by means of a variable length coding step a set of codewords such that the more frequently occurring values of digital signals are represented by shorter code lengths and the less frequently occurring values by longer code lengths, said variable length coding step including a defining sub-step for generating said set of codewords and in which the code used is built with the same length distribution L′=(n′_(i)) [i=1, 2, . . . , l_(max)] as the binary Huffman code distribution L=(n_(i)) [i=1, 2, . . . , l_(max)], n_(i) being the number of codewords of length i, and is constructed by implementation of the following steps:

-   (a) creating a synchronization tree structure of the code with decreasing depths for each elementary branch of said tree, with initialized parameters D=l_(max), K=n_(lmax)/2, and current l=l_(cur)=l_(max), the notations being:
    -   D=arbitrary integer representing the maximum length of a string of zeros;
    -   l_(max)=the greatest codeword length;
    -   K=arbitrary integer representing the maximum length of a string of ones;
    -   n_(lmax)=number of codewords of length l_(max) in the Huffman code;
-   (b) for each length l_(cur) beginning from l_(max), if n′_(lcur)≠n_(lcur), using the codeword 1_(k) as prefix and anchoring to it the maximal size elementary branch of depth D′=l_(cur)−K;
-   (c) if 1_(k) cannot be used as prefix, finding a suitable prefix by choosing the minimal length codeword that is in excess with respect to the desired distribution.

The invention also relates to the corresponding encoding device.

The results obtained when implementing said invention are presented in FIG. 3 for two reference codes as proposed in the document “Error states and synchronization recovery for variable length codes”, by Y. Takishima et al., IEEE Transactions on Communications, vol. 42, No. 2/3/4, February/March/April 1994, pp. 783-792, i.e. a code for motion vectors (table VIII of said document) and the English alphabet. As can be seen in the table of FIG. 3, where the values of E_(s) are very close to each other in both situations, the proposed codes perform as well as those obtained in said document, but are obtained with much lower complexity, since the algorithm according to the invention requires only a limited number of iterations (whereas the algorithm described in said document undertakes manipulations on a greater number of branches).

The proposed algorithm is simple enough to be applied by hand to relatively short codes, for which the fast synchronizing structure is obtained in only three iterations of the algorithm. It can also be applied to longer codes, for example the 206-symbol variable length code used in an H.263 video codec to encode the DCT coefficients, for which the error span is, when using the invention, much smaller than the original one for the same average length (which means that, with the code according to the present invention, the decoder would statistically resynchronize one symbol earlier than in the current case, and at no cost in terms of coding rate).

1. A method of processing digital signals for reducing the amount of data used to represent said digital signals and forming by means of a variable length coding step a set of codewords such that the more frequently occurring values of digital signals are represented by shorter code lengths and the less frequently occurring values by longer code lengths, said variable length coding step including a defining sub-step for generating said set of codewords and in which the code used is built with the same length distribution L′=(n′_(i)) [i=1, 2, . . . , l_(max)] as the binary Huffman code distribution L=(n_(i)) [i=1, 2, . . . , l_(max)], n_(i) being the number of codewords of length i, and is constructed by implementation of the following steps: (a) creating a synchronization tree structure of the code with decreasing depths for each elementary branch of said tree, with initialized parameters D=l_(max), K=n_(lmax)/2, and current l=l_(cur)=l_(max), the notations being: D=arbitrary integer representing the maximum length of a string of zeros; l_(max)=the greatest codeword length; K=arbitrary integer representing the maximum length of a string of ones; n_(lmax)=number of codewords of length l_(max) in the Huffman code; (b) for each length l_(cur) beginning from l_(max), if n′_(lcur)≠n_(lcur), using the codeword 1_(k) as prefix and anchoring to it the maximal size elementary branch of depth D′=l_(cur)−K; (c) if 1_(k) cannot be used as prefix, finding a suitable prefix by choosing the minimal length codeword that is in excess with respect to the desired distribution.
2. A method of encoding digital signals comprising at least the steps of applying to said digital signal an orthogonal transform producing a plurality of coefficients, quantizing said coefficients and coding the quantized coefficients by means of a variable length coding step in which the more frequently occurring values are represented by shorter code lengths and the less frequently occurring values by longer code lengths, said variable length coding step including a defining sub-step for generating a set of codewords corresponding to said digital signals and in which the code used is built with the same length distribution L′=(n′_(i)) [i=1, 2, . . . , l_(max)] as the binary Huffman code distribution L=(n_(i)) [i=1, 2, . . . , l_(max)], n_(i) being the number of codewords of length i, and is constructed by implementation of the following steps: (a) creating a synchronization tree structure of the code with decreasing depths for each elementary branch of said tree, with initialized parameters D=l_(max), K=n_(lmax)/2, and current l=l_(cur)=l_(max), the notations being: D=arbitrary integer representing the maximum length of a string of zeros; l_(max)=the greatest codeword length; K=arbitrary integer representing the maximum length of a string of ones; n_(lmax)=number of codewords of length l_(max) in the Huffman code; (b) for each length called l_(cur) beginning from l_(max), if n′_(lcur)≠n_(lcur), using the codeword 1_(k) as prefix and anchoring to it the maximal size elementary branch of depth D′=l_(cur)−K; (c) if 1_(k) cannot be used as prefix, finding a suitable prefix by choosing the minimal length codeword that is in excess with respect to the desired distribution.
3. A device for encoding digital signals, said device comprising at least an orthogonal transform module, applied to said input digital signals for producing a plurality of coefficients, a quantizer, coupled to said transform module for quantizing said plurality of coefficients, and a variable length coder, coupled to said quantizer for coding said plurality of quantized coefficients in accordance with a variable length coding algorithm and generating an encoded stream of data bits, said coefficient coding operation, in which the more frequently occurring values are represented by shorter code lengths and the less frequently occurring values by longer code lengths, including a defining sub-step for generating a set of codewords corresponding to said digital signals and in which the code used is built with the same length distribution L′=(n′_(i)) [i=1, 2, . . . , l_(max)] as the binary Huffman code distribution L=(n_(i)) [i=1, 2, . . . , l_(max)], n_(i) being the number of codewords of length i, and is constructed by implementation of the following steps: (a) creating a synchronization tree structure of the code with decreasing depths for each elementary branch of said tree, with initialized parameters D=l_(max), K=n_(lmax)/2, and current l=l_(cur)=l_(max), the notations being: D=arbitrary integer representing the maximum length of a string of zeros; l_(max)=the greatest codeword length; K=arbitrary integer representing the maximum length of a string of ones; n_(lmax)=number of codewords of length l_(max) in the Huffman code; (b) for each length l_(cur) beginning from l_(max), if n′_(lcur)≠n_(lcur), using the codeword 1_(k) as prefix and anchoring to it the maximal size elementary branch of depth D′=l_(cur)−K; (c) if 1_(k) cannot be used as prefix, finding a suitable prefix by choosing the minimal length codeword that is in excess with respect to the desired distribution.